Abstract

An efficient, low-complexity, soft-information detector for multiple input multiple output channels and lattice 
constellations was devised, based on Tanner graph representations of lattices. Due to the coding gain associated with 
a lattice, structural relations exist between certain lattice points, which can be associated via an equivalence relation 
for detection purposes. The algorithm can generate both total and extrinsic a posteriori probability at the detector's output.
The step-back artifact (of traditional sphere decoders) is eliminated. The algorithm applies to general lattices and 
enables iterative receivers; it was tested in the case of uncoded transmission for a superorthogonal constellation in 
two scenarios. In quasistatic (block) fading it was found to achieve maximum likelihood performance even with one 
'surviving' label (out of six); in independent fading with coordinate (component) interleaving and iterations between 
equalization and detection, it performs close to interference-free transmission. The coordinate interleaved scenario
outperforms the former despite the absence of forward error correction coding.

Index Terms 

Belief propagation on a lattice, sphere decoder, soft information lattice detection, closest point search in lattices, 
Tanner graph, MIMO, iterations detector-decoder. 

I. Introduction 

Multiple input multiple output (MIMO) transmission has emerged as a strong scenario for future high-speed 
wireless communications due to the large capacity potential of MIMO channels. Space-time codes that exploit both 
spatial diversity and time diversity have been widely proposed as MIMO modulation in the past decade to achieve 
reliable transmission. 

Recently, the importance of lattice MIMO constellations in constructing space-time lattice codes was recognized 
by El-Gamal et al. [10] from a diversity-multiplexing tradeoff perspective. Superorthogonal space-time codes — 
reported in [2], then in [3], [4], [5], [6] (where they were dubbed 'superorthogonal') — are in fact lattice space time 
codes; the lattice structure inherent to the superorthogonal construction was noted by Ionescu and Yan [7, Section III] 
(Example 2 in the sequel offers more detail). As lattices, such constellations lend themselves to efficient detection 
algorithms, e.g. sphere decoding. Classic sphere decoding (see [15] and references therein) uses hard decision, along
with a step-back provision; soft-output versions have been proposed, but rely on a list of important candidates, and
retain the step-back provision. In [1], lattice partitioning is used to divide the infinite lattice into a finite number of 
cosets. Each coset is then labeled by a codeword of a finite Abelian group block code, called a label code. In [13], 
a Tanner graph (TG) representation for the label code was developed; this opens an opportunity for using belief 
propagation on the lattice labels. 

The sequel takes a novel, qualitatively different approach to soft-output closest point search in lattices, via a form 
of belief propagation on a lattice. Due to the coding gain associated with a lattice, structural relations exist between 
certain lattice points, which can be associated via an equivalence relation for detection purposes. The algorithm 
can generate both total and extrinsic a posteriori probability (APP) at the detector's output. The step-back feature 
is eliminated. For each channel use, a filter bank for interference cancellation with minimum mean square error 
(IC-MMSE) is used to remove the channel effects. Then, a reduced-complexity lattice decoder based on TG lattice 
representation is proposed for computing total APP and extrinsic APP. The capability of calculating the extrinsic 
APP enables decoding schemes that iterate between detection and decoding. This novel lattice detection algorithm is 
applied to detecting superorthogonal space-time lattice codes [7] in quasistatic fading, and to a coordinate interleaved 
[18] scenario. The following notation is adhered to. Vectors are denoted by lowercase bold letters; $a_i$ denotes the
i-th element of vector $a$. Matrices are denoted by uppercase bold letters. The j-th column vector and the ij-th
element of a matrix, say $A$, are denoted by $a_j$ and $a_{ij}$, respectively. The superscripts T and H are used to denote
transposition and complex conjugated transposition, respectively.

II. Problem Definition and System Model 

Complex and real transmission models are described; a general formulation for lattice constellations for MIMO 
channels is then introduced, followed by two examples pertaining respectively to linear dispersion and superorthog- 
onal codes. 

A. Rayleigh flat fading MIMO channels 

Consider MIMO wireless transmission with $N_t$ transmit antennas and $N_r$ receive antennas in Rayleigh flat
fading. The channel coefficients are assumed to be constant over a block of T MIMO channel uses and change
independently from block to block. The transmission of each block is then given by

$$Y = \sqrt{1/N_t}\, S H + N \qquad (1)$$

where $Y \in \mathbb{C}^{T \times N_r}$, $H \in \mathbb{C}^{N_t \times N_r}$, $S \in \mathcal{A}^{T \times N_t}$ and $N \in \mathbb{C}^{T \times N_r}$ are arrays of received signals, channel gain
coefficients, transmitted signals and additive noises, respectively. The elements of $N$ are i.i.d. zero-mean complex-valued
Gaussian random variables with variance $N_0/2$ per dimension, i.e., $n_{ij} \sim \mathcal{CN}(0, N_0)$. The channel gain
matrix $H$ has elements $h_{ij} \sim \mathcal{CN}(0, 1)$ representing the channel gain coefficients between the i-th transmit antenna
and the j-th receive antenna, assumed pairwise independent. The array $S$ describes the transmitted symbols chosen
from alphabet $\mathcal{A}$¹; $s_{ij} \in \mathbb{C}$ is radiated from the j-th transmit antenna during the i-th channel use. By enforcing the
power constraint 

$$\mathrm{E}\{|||S|||^2\} \le N_t, \qquad (2)$$

where $|||\cdot|||$ denotes the Euclidean matrix norm and $\mathrm{E}\{\cdot\}$ denotes expectation, the average signal-to-noise ratio (SNR)
per receive antenna is $1/N_0$.

It is important to note that (1) can accommodate various setups, which include the case T = 1 that allows for 
independent (rather than block) fading. Similarly, the arrays S may have a certain structure, e.g. they may represent 
space-time code matrices; or, they may simply be arrays of unrelated values obtained after interleaving the real 
coordinates of structured matrices (Section IV-B) then forming new complex valued arrays out of the scrambled 
coordinates. 



B. Equivalent real-valued transmission model 

Eq. (1) is the receive equation for the transmission of complex valued arrays from $N_t$ transmit antennas during
T MIMO channel uses. It is also convenient to introduce equivalent real-valued transmission models. To this end,
define two isomorphisms from complex domain to real domain, $\mathcal{I}: \mathbb{C}^{M \times 1} \to \mathbb{R}^{2M \times 1}$ and $\phi: \mathbb{C}^{M \times N} \to \mathbb{R}^{2MN \times 1}$,
as follows:

$$\mathcal{I}(a) \stackrel{\mathrm{def}}{=} [\Re(a)^T \ \ \Im(a)^T]^T, \qquad (3)$$
$$\phi(A) = [\mathcal{I}(a_1)^T \cdots \mathcal{I}(a_N)^T]^T, \qquad (4)$$

where $a \in \mathbb{C}^{M \times 1}$ and $A = [a_1 \ldots a_N] \in \mathbb{C}^{M \times N}$. The real-valued transmission model that is equivalent to (1) is

$$y_c = H_c x + n_c \qquad (5)$$

where $y_c = \phi(Y^T)$, $n_c = \phi(N^T)$, $x = \phi(S^T)$ and

$$H_c = I_T \otimes \begin{bmatrix} \Re(H^T) & -\Im(H^T) \\ \Im(H^T) & \Re(H^T) \end{bmatrix}.$$

Note that $H_c$ is a $2N_rT \times 2N_tT$ block-diagonal real channel matrix consisting of T identical diagonal replicas of the same $2N_r \times 2N_t$
matrix ($I_T$ is the identity matrix of dimension T and $\otimes$ denotes the Kronecker product). A similar model has been
reported in [10].
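
As an illustrative check of the real-valued model (a minimal sketch, not part of the original development), the following Python fragment builds $\mathcal{I}$, $\phi$ and $H_c$ for a small random example and compares $\phi(Y^T)$ with $H_c\,\phi(S^T)$ in the noiseless case. The helper names (`iso_vec`, `phi`, `real_channel`) and the array sizes are assumptions made for illustration; the power normalization of (1) is left out because it scales both sides equally.

```python
import random

def iso_vec(a):
    """I(a) of eq. (3): real parts first, then imaginary parts."""
    return [z.real for z in a] + [z.imag for z in a]

def phi(A):
    """phi(A) of eq. (4): concatenate I(a_n) over the columns a_n of A (a list of rows)."""
    return [v for col in zip(*A) for v in iso_vec(col)]

def transpose(A):
    return [list(r) for r in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def real_channel(H, T):
    """H_c = I_T (x) [[Re(H^T), -Im(H^T)], [Im(H^T), Re(H^T)]]."""
    Ht = transpose(H)  # N_r x N_t
    block = [[z.real for z in r] + [-z.imag for z in r] for r in Ht] + \
            [[z.imag for z in r] + [z.real for z in r] for r in Ht]
    rows, cols = len(block), len(block[0])
    Hc = [[0.0] * (cols * T) for _ in range(rows * T)]
    for t in range(T):                      # place T identical diagonal replicas
        for i in range(rows):
            for j in range(cols):
                Hc[t * rows + i][t * cols + j] = block[i][j]
    return Hc

random.seed(0)
T, Nt, Nr = 2, 2, 3                         # illustrative sizes (assumed)
rnd = lambda: complex(random.gauss(0, 1), random.gauss(0, 1))
S = [[rnd() for _ in range(Nt)] for _ in range(T)]
H = [[rnd() for _ in range(Nr)] for _ in range(Nt)]
Y = matmul(S, H)                            # noiseless, unnormalized version of (1)

y_c = phi(transpose(Y))
x = phi(transpose(S))
Hc = real_channel(H, T)
y_model = [sum(h * v for h, v in zip(row, x)) for row in Hc]
max_err = max(abs(a - b) for a, b in zip(y_c, y_model))
```

In this sketch `max_err` is at the level of floating-point rounding, which is what the equivalence between (1) and (5) predicts.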

In addition, define a new vector $y = \phi(Y)$. By definition of $\phi$, it can be seen that the vector $y$ is some permutation
$\pi$ of $y_c$, since $y$ and $y_c$ are the images, via $\phi$, of $Y$ and its transpose $Y^T$, respectively. One can obtain $y$ from $y_c$ as follows:

$$y = \pi(y_c) = \pi(H_c x + n_c) = \pi(H_c)\, x + \pi(n_c) = H x + n, \qquad (6)$$

where $\pi(H_c) \stackrel{\mathrm{def}}{=} H$ denotes a row permutation of $H_c$ by $\pi$.

¹ Different alphabets could be used on different transmit antennas, e.g. $\mathcal{A}_j$ could be used on the j-th transmit antenna; the alphabets $\mathcal{A}_j$ could
differ, for example, when identical constellations are assigned with unequal powers to different transmit antennas. While this general case could
be accommodated, it is secondary in importance for the purpose of this work.

The real channel models (6) and (5) are both equivalent to the MIMO model in eq. (1), and can be used
interchangeably. In the sequel, (6) will be preferred since it is consistent with the transmission model used in [7],
which is referenced in order to address certain important properties of the super-orthogonal space-time codes used, in
turn, to demonstrate the algorithm for finding the closest point in a lattice.

C. Space-time lattice codes 

An m-dimensional real lattice $\Lambda$ is a discrete additive subgroup of $\mathbb{R}^m$ defined as $\Lambda = \{Bu : u \in \mathbb{Z}^m\}$ where
the real matrix $B$ of size $m \times m$ is the generator matrix of $\Lambda$ [10]. A lattice code $\mathcal{C}(\Lambda, u_0, \mathcal{R})$ is the finite subset of
the lattice translate $\Lambda + u_0$ inside some shaping region $\mathcal{R}$, i.e., $\mathcal{C}(\Lambda, u_0, \mathcal{R}) = \{\Lambda + u_0\} \cap \mathcal{R}$, where $\mathcal{R}$ is a bounded
region of $\mathbb{R}^m$ [10]. A space-time coding scheme with a space-time code matrix set $\mathcal{S}$, such that $\phi(S^T) \in \mathbb{R}^m$ for
all $S \in \mathcal{S}$, is a lattice space-time code if the m-dimensional image of $\mathcal{S}$ via the isomorphism $\phi$ is a lattice code
$\mathcal{C}(\Lambda, u_0, \mathcal{R})$, i.e., $\phi(\{S^T\}) = \{\{Bu : u \in \mathbb{Z}^m\} + u_0\} \cap \mathcal{R}$. Many well-known space-time modulation schemes in
the literature can indeed be treated as space-time lattice codes. Two important examples of space-time lattice codes
are given below.

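Example 1: (Linear dispersion codes) A linear dispersion code [11] defines a mapping of a complex vector
$s = [s_0, s_1, \cdots, s_{K-1}]^T$ to a $T \times N_t$ complex matrix $S$ as follows:

$$S = \sum_{l=0}^{K-1} \left( s_l P_l + s_l^* Q_l \right) \qquad (7)$$

where $\{P_l\}_{l=0}^{K-1}$, $\{Q_l\}_{l=0}^{K-1}$ are $T \times N_t$ complex matrices. The linear dispersion code can be further rearranged as

$$S = \sum_{l=0}^{K-1} \left( \Re(s_l)\, \tilde{P}_l + \Im(s_l)\, \tilde{Q}_l \right) \qquad (8)$$

with $\tilde{P}_l = P_l + Q_l$ and $\tilde{Q}_l = jP_l - jQ_l$, where $j = \sqrt{-1}$. Let $\chi = \mathcal{I}(s)$; then one can express the linear dispersion code linearly
in terms of $\chi$ and a set of matrices $\mathcal{C} \stackrel{\mathrm{def}}{=} \{C_l\}_{l=0}^{2K-1} = \{\tilde{P}_0, \tilde{P}_1, \cdots, \tilde{P}_{K-1}, \tilde{Q}_0, \tilde{Q}_1, \cdots, \tilde{Q}_{K-1}\}$ via

$$S = \sum_{l=0}^{2K-1} \chi_l C_l \qquad (9)$$

where $C_l$ is the l-th matrix of $\mathcal{C}$. Consequently, the image of $S^T$ via $\phi$, denoted $x$, is given by

$$x \stackrel{\mathrm{def}}{=} \phi(S^T) = \sum_{l=0}^{2K-1} \chi_l\, \phi(C_l^T) = \Gamma \chi \qquad (10)$$

with $\Gamma = [\phi(C_0^T), \cdots, \phi(C_{2K-1}^T)]$.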
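
The rearrangement from (7) to (9) can be checked numerically. The sketch below draws random dispersion matrices and QPSK-like symbols (both are illustrative assumptions, as are the helper names `add` and `scale`), forms $S$ once through the complex expansion (7) and once through the real expansion (9) with $\chi = \mathcal{I}(s)$, and records the largest entry-wise difference.

```python
import random

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(c, A):
    return [[c * a for a in row] for row in A]

random.seed(1)
K, T, Nt = 2, 2, 2                                   # illustrative sizes (assumed)
rnd = lambda: complex(random.gauss(0, 1), random.gauss(0, 1))
P = [[[rnd() for _ in range(Nt)] for _ in range(T)] for _ in range(K)]
Q = [[[rnd() for _ in range(Nt)] for _ in range(T)] for _ in range(K)]
s = [complex(random.choice([-1, 1]), random.choice([-1, 1])) for _ in range(K)]
zero = [[0j] * Nt for _ in range(T)]

# Eq. (7): S = sum_l (s_l P_l + conj(s_l) Q_l)
S7 = zero
for l in range(K):
    S7 = add(S7, add(scale(s[l], P[l]), scale(s[l].conjugate(), Q[l])))

# Eqs. (8)-(9): real expansion over C = {P_l + Q_l} followed by {jP_l - jQ_l}
C = [add(P[l], Q[l]) for l in range(K)] + \
    [add(scale(1j, P[l]), scale(-1j, Q[l])) for l in range(K)]
chi = [z.real for z in s] + [z.imag for z in s]      # chi = I(s), eq. (3)
S9 = zero
for l in range(2 * K):
    S9 = add(S9, scale(chi[l], C[l]))

gap = max(abs(a - b) for ra, rb in zip(S7, S9) for a, b in zip(ra, rb))
```

The agreement follows from $s_l P_l + s_l^* Q_l = \Re(s_l)(P_l + Q_l) + \Im(s_l)(jP_l - jQ_l)$, applied term by term.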

It is clear from (10) that when the vector $\chi$ is proportional to a vector of integers, a linear dispersion code is a
lattice code with generator matrix $\Gamma$; this is the case when s is from a particular modulation constellation such as
PAM or QAM. In general, $\chi$ is not an integer vector, e.g. when the elements of s are from a PSK constellation.
However, if, by construction of the linear dispersion code, s is selected to be from a lattice $\Lambda'$ then the points $\chi$
are carved from the lattice $\Lambda'$ via a shaping region $\mathcal{R} \subset \mathbb{R}^m$. That is,

$$\chi \in \Lambda' \cap \mathcal{R}, \qquad (11)$$

where $\Lambda' = \{Bu : u \in \mathbb{Z}^m\}$ and $B$ is the generator matrix of $\Lambda'$, and the linear dispersion code is a lattice space-time
code with generator matrix $\Gamma B$. One may find different pairs of lattice $\Lambda'$ and shaping region $\mathcal{R}$ defining



DRAFT 



February 1, 2008 



IONESCU AND ZHU 



5 



the same $\chi$'s; the choice of $\Lambda'$ and $\mathcal{R}$ will influence the complexity of the corresponding decoder, as discussed in
[13] (unless some basis reduction approach is used to process the generator matrix). The real transmission model
becomes

$$y = H\Gamma B u + n, \qquad (12)$$

and is equivalent to using a lattice space-time code with generator matrix $\Gamma B$.

Example 2: (Super-orthogonal space-time lattice codes) A super-orthogonal space-time code is constructed [7]
by expanding a (generalized) orthogonal design [8], which in turn is obtained as a linear combination of matrices
similar to (7), (8), with expansion coefficients derived from a complex vector s; the difference from a linear
dispersion code is that the latter matrices verify an additional constraint (see [7, eqs. (2), (3)]). A super-orthogonal
space-time construction for T = 2, $N_t = 2$, and QPSK constellation, having thirty two codematrices, was described
in [2], [3], [4], [5], [6]. A generic codematrix S can be expressed as [7]²

$$S = \sum_{l=0}^{3} \chi_l C_l + \sum_{l=0}^{3} \chi'_l C'_l, \qquad (13)$$
$$\chi_l \neq 0 \Rightarrow \chi'_l = 0 \ \text{ and } \ \chi'_l \neq 0 \Rightarrow \chi_l = 0, \ \forall l; \qquad (14)$$

above, $\chi_l$ and $\chi'_l$ ($l = 0, 1, 2, 3$) are either 1, −1, or 0, and the nonzero values are real parts of complex elements
from a complex QPSK constellation; the two sets of real coefficients $\chi_l$ and $\chi'_l$ ($l = 0, 1, 2, 3$) are not simultaneously
nonzero, i.e. either all $\chi_l$'s or all $\chi'_l$'s vanish. As discussed in [7], the super-orthogonal matrix codebook is embedded
into an 8-dimensional real vector space obtained as the direct sum of two 4-dimensional real vector spaces³. The
two sets of matrices $C_l$ and $C'_l$ are basis matrices in the component vector spaces that form the direct sum:
(15)
(16)

The isomorphism of a super-orthogonal space-time codematrix S, denoted by $x \stackrel{\mathrm{def}}{=} \phi(S^T)$, is given by

$$x = \phi(S^T) = \Gamma \chi_\oplus \qquad (17)$$

where $\chi_\oplus = [\chi_0, \chi_1, \cdots, \chi_3, \chi'_0, \chi'_1, \cdots, \chi'_3]^T = [\chi^T \ \chi'^T]^T \in \mathbb{R}^8$ is a direct sum of two 4-dimensional vectors,
and $\Gamma = [\Gamma_1 \ \Gamma_2]$ is an $8 \times 8$ real matrix with $\Gamma_1 = [\phi(C_0^T), \cdots, \phi(C_3^T)]$ and $\Gamma_2 = [\phi(C_0'^T), \cdots, \phi(C_3'^T)]$,
respectively. It also follows from [7] that $\Gamma$ is proportional with a unitary matrix via $\Gamma\Gamma^H = 2I$.



² Definition (3) of the isomorphism $\mathcal{I}$ from a complex vector to a real vector differs slightly from [7], where it was defined by interlacing
the real and imaginary parts; i.e., in [7], if $s = [z_1, \ldots, z_K]^T \in \mathbb{C}^K$ then $\mathcal{I}(s) = \chi \stackrel{\mathrm{def}}{=} [\Re\{z_1\}, \Im\{z_1\}, \cdots, \Re\{z_K\}, \Im\{z_K\}]^T$, rather
than keeping the real (and imaginary) parts together as done in eq. (3). This is the reason for swapping the second and third matrices in eqs.
(15), (16) relative to [7, Sec. III].

3 In the superorthogonal construction the two 4-dimensional components of the direct sum are reflection symmetries (around origin) of one 
another [9]. 
Because s takes values from a QPSK constellation $\{\pm 1 \pm j\}$, $j = \sqrt{-1}$, the nonzero realizations of either of the
two vectors $\chi$, $\chi'$ are the sixteen 4-dimensional real vectors with elements ±1; that is, either $\chi_\oplus = [\chi^T \ 0_{1 \times 4}]^T$
or $\chi_\oplus = [0_{1 \times 4} \ \chi'^T]^T$.

Since $\chi_\oplus \in \mathbb{Z}^8$, the vector x is recognized to be from some lattice $\Lambda$ with generator matrix $\Gamma$, via (17). In
the sequel it is helpful to further recognize that $\chi_\oplus$ itself is from a direct sum of two 4-dimensional checkerboard
lattices. Indeed, consider the lattice $L \stackrel{\mathrm{def}}{=} D_4 \oplus D_4$, i.e., a point $[\lambda_1 \ \lambda_2 \ldots \lambda_8]$ in $D_4 \oplus D_4$ has the property that
$[\lambda_1 \ \lambda_2 \ \lambda_3 \ \lambda_4]$, $[\lambda_5 \ \lambda_6 \ \lambda_7 \ \lambda_8]$ are from $D_4$. Let $[d_1 \ d_2 \ d_3 \ d_4]$ denote a point in the second shell of $D_4$, i.e. satisfying
$\sum_{i=1}^{4} d_i^2 = 4$. There are twenty four points in the second shell of $D_4$, of which exactly sixteen will satisfy $|d_i| = 1$;
denote this set by $\mathcal{D}$. If B is the $4 \times 4$ generator matrix of $D_4$ then $D_4 \oplus D_4$ has generator matrix $\begin{bmatrix} B & 0_{4 \times 4} \\ 0_{4 \times 4} & B \end{bmatrix}$.
Then $L = L_1 \oplus L_2$, where $L_1$ and $L_2$ have generator matrices $[B \ 0_{4 \times 4}]$ and respectively $[0_{4 \times 4} \ B]$. Both $L_1$ and
$L_2$ are isomorphic with $D_4$. $L_1$ contains the sixteen points in the set $\{[c^T \ 0_{1 \times 4}]^T \mid c \in \mathcal{D}\}$, and $L_2$ contains
the sixteen points in the set $\{[0_{1 \times 4} \ c^T]^T \mid c \in \mathcal{D}\}$. Note that the nonzero realizations of either of the vectors $\chi$,
$\chi'$ are the sixteen points in the second shell of $D_4$ having unit magnitude real coordinates; thereby, $\Lambda = \Lambda_1 \oplus \Lambda_2$
where $\Lambda_i$ is isomorphic with $L_i$, $i = 1, 2$, and $\chi_\oplus$ is from a direct sum of two 4-dimensional checkerboard lattices.
A generator matrix for a checkerboard lattice $D_4$ is, e.g., the matrix B in (35).
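
The shell counts quoted above are easy to confirm by enumeration. The short sketch below assumes the usual description of $D_4$ as the integer 4-tuples with even coordinate sum, lists all points of squared norm 4, and isolates those with unit-magnitude coordinates; the names `in_D4`, `shell` and `unit_coords` are illustrative.

```python
from itertools import product

def in_D4(p):
    """Checkerboard lattice membership: integer coordinates with an even sum."""
    return sum(p) % 2 == 0

# Points with squared Euclidean norm 4 have coordinates in {-2, ..., 2}.
shell = [p for p in product(range(-2, 3), repeat=4)
         if in_D4(p) and sum(c * c for c in p) == 4]

# Subset with |d_i| = 1 for every coordinate (the set called D in the text).
unit_coords = [p for p in shell if all(abs(c) == 1 for c in p)]
```

The remaining eight points of the shell are the permutations of $(\pm 2, 0, 0, 0)$.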
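It follows from (17) that $x = \phi(S^T)$ can be written as

$$x = \Gamma \begin{bmatrix} B & 0_{4 \times 4} \\ 0_{4 \times 4} & B \end{bmatrix} u, \quad u = [u_1 \ldots u_8]^T \in \mathbb{Z}^8 \qquad (18)$$

where B is the generator matrix of the checkerboard lattice $D_4$, given in (35). Thereby, x can be viewed as being
from a lattice with generator matrix $\Gamma \begin{bmatrix} B & 0_{4 \times 4} \\ 0_{4 \times 4} & B \end{bmatrix} = [\Gamma_1 B \ \ \Gamma_2 B]$.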



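For a super-orthogonal space-time lattice code the real equivalent transmission model in eq. (6) becomes

$$y = Hx + n = H\Gamma\chi_\oplus + n = H_\oplus \chi_\oplus + n = H_\oplus B u + n \qquad (19)$$

where the second equality is obtained according to (17), $H_\oplus \stackrel{\mathrm{def}}{=} H\Gamma$, and B in the last expression stands for the block-diagonal
generator matrix appearing in (18). Note that in [7] the transmission model
for the same super-orthogonal space-time code is (see footnote 2):

$$y_\oplus = G_\oplus \chi_\oplus + n_\oplus. \qquad (20)$$

It can be verified that $G_\oplus = \begin{bmatrix} H\Gamma_1 & 0_{4 \times 4} \\ 0_{4 \times 4} & H\Gamma_2 \end{bmatrix}$. Furthermore, the matrix $G_\oplus$ was shown in [7] to be proportional with
a unitary matrix, i.e., $G_\oplus G_\oplus^H = \alpha I$. Denote $H_\oplus^{(1)} = H\Gamma_1$ and $H_\oplus^{(2)} = H\Gamma_2$. Then, $H_\oplus^{(k)}$, k = 1, 2, is unitary up
to a scalar, i.e.

$$\left(H_\oplus^{(k)}\right)^H H_\oplus^{(k)} = \alpha I, \quad k = 1, 2. \qquad (21)$$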



III. Reduced search soft-output detector for closest point search in lattices 

While lumping a channel matrix with some (equivalent) generator matrix, as in (19), might be tempting, the
new lattice having generator matrix $H\Gamma$ or $H\Gamma B$ may have labels with very large label coordinate alphabets (see
Section III-B, [13]) for random H, unless some form of basis reduction can be devised. It is more straightforward
to illustrate the concept by removing the effect of the channel matrix H via some equalization step, then dealing
with the underlying lattice separately. This is the approach taken in the sequel.

A novel soft-information detection algorithm for lattice space-time constellations is introduced below. Detection 
is performed in two stages: linear minimum mean square error (LMMSE) filtering, and belief propagation (BP) 
on a lattice. In the first stage, a finite impulse response (FIR) LMMSE filter bank is used to remove the effect of 
the channel; the lattice redundancy is subsequently exploited by a novel lattice detector based on a Tanner graph 
representation of the lattice. 



A. MMSE soft equalizer with interference cancellation 

The equivalent real transmission model is given in (6). The goal of the MMSE soft equalizer is to remove the
effect of the channel H, and provide a soft estimate of each component $x_i$ of $x$ so as to minimize the interference
due to other coordinates $\{x_j\}_{j \neq i}$, and to noise n. For the i-th branch, the soft estimate, denoted as $\hat{x}_i$, is given
by

$$\hat{x}_i = m_i^T y \qquad (22)$$

with the i-th FIR filter $m_i$ being

$$m_i = \arg\min_{m} \mathrm{E}\{\|x_i - m^T y\|^2\} \qquad (23)$$

subject to the unit power constraint

$$m_i^T h_i = 1, \qquad (24)$$

with $h_i$ the i-th column of H. This power constraint mitigates the attenuation effect on the desired signal due to the filtering. The optimal solution
is [12]

$$m_i = m_i^c + \frac{\alpha_i}{h_i^T R^{-1} h_i}\, R^{-1} h_i \qquad (25)$$

where $R = \frac{1}{2N_t} H H^H + \frac{N_0}{2} I$ is the covariance matrix of y, $m_i^c = \frac{1}{2N_t} R^{-1} h_i$ is the optimal solution for (23)
without power constraint, and $\alpha_i = 1 - (m_i^c)^T h_i$. The MSE $\sigma_i^2 = \mathrm{E}\{\|x_i - m_i^T y\|^2\}$ of the i-th branch is

$$\sigma_i^2 = \frac{1}{2N_t} - (m_i^c)^T R\, m_i^c + \frac{\alpha_i^2}{h_i^T R^{-1} h_i}. \qquad (26)$$
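
A numerical sanity check of the constrained filter may help fix ideas. The sketch below is only illustrative: it assumes a real channel matrix with i.i.d. Gaussian entries, coordinates of x that are uncorrelated with variance $1/(2N_t)$, and a small Gauss-Jordan solver (`solve`) written for the occasion. It forms the filter as the unconstrained solution plus the correction along $R^{-1}h_i$, then compares the closed-form MSE with the MSE evaluated directly from the quadratic form $\mathrm{E}\{(x_i - m^T y)^2\}$.

```python
import random

def solve(A, b):
    """Solve A z = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[k]] for k, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [u - f * v for u, v in zip(M[r], M[c])]
    return [M[k][n] / M[k][k] for k in range(n)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(A, v):
    return [dot(row, v) for row in A]

random.seed(2)
dim, Nt, N0 = 4, 2, 0.5                 # illustrative values (assumed)
var_x = 1.0 / (2 * Nt)                  # per-coordinate signal variance
H = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(dim)]
# Covariance of y: R = var_x * H H^T + (N0 / 2) I
R = [[var_x * dot(H[a], H[b]) + (N0 / 2 if a == b else 0.0)
      for b in range(dim)] for a in range(dim)]

def constrained_filter(i):
    h = [row[i] for row in H]           # i-th column of H
    Rinv_h = solve(R, h)
    m_c = [var_x * v for v in Rinv_h]   # unconstrained MMSE filter
    alpha = 1.0 - dot(m_c, h)
    g = dot(h, Rinv_h)                  # h^T R^{-1} h
    m = [a + alpha / g * v for a, v in zip(m_c, Rinv_h)]
    mse_closed = var_x - dot(m_c, matvec(R, m_c)) + alpha ** 2 / g
    mse_direct = var_x - 2 * var_x * dot(m, h) + dot(m, matvec(R, m))
    return m, h, mse_closed, mse_direct

results = [constrained_filter(i) for i in range(dim)]
```

For every branch the filter satisfies $m_i^T h_i = 1$, and the two MSE values agree to rounding error.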



If detection and decoding can be performed iteratively, then soft information about x can be fed back from the
FEC decoder and made available to the filter bank in the form of probabilities of valid realizations of transmitted
vectors x, or of their elements $x_i$; i.e., either at the vector level, $\{\Pr(x = \phi(C^T)) \mid \phi(C^T) \in \mathcal{C}(\Lambda, u_0, \mathcal{R})\}$, or at the
coordinate level, e.g. in the case when coordinate interleaving [18] is used to scramble the coordinates of several
vectors x prior to transmission. In the latter case the structure present in the different multidimensional lattice points
is destroyed during transmission through the channel; not only does this mean that the coordinate probabilities
supplied by the decoder have to be unscrambled before being fed back to the LMMSE filter for interference
cancellation (IC; see Fig. 4), but the performance can be improved (over the non-interleaved scenario) even in an
uncoded system (see Section IV-B).

An iterative receiver aims at iteratively canceling the interference prior to filtering by forming a soft interference
estimator in one of two ways:

1) Vector level feedback:

$$\bar{x}_{IC} = \sum_{\phi(C^T) \in \mathcal{C}(\Lambda, u_0, \mathcal{R})} \phi(C^T)\, \Pr(x = \phi(C^T)) \qquad (27)$$

2) Coordinate level feedback: If $\mathcal{K}_i$ is the i-th coordinate alphabet, the average interference value at position i is

$$\bar{x}_{IC,i} = \sum_{c \in \mathcal{K}_i} c\, \Pr(x_i = c). \qquad (28)$$

Let $\bar{x}_{IC,\bar{i}}$ denote the vector obtained by setting the i-th element of $\bar{x}_{IC}$ to zero, i.e., $\bar{x}_{IC,\bar{i}} = [\cdots, \bar{x}_{IC,i-1}, 0, \bar{x}_{IC,i+1}, \cdots]^T$;
the interference cancellation is performed for the i-th branch as

$$y_i = y - H \bar{x}_{IC,\bar{i}} \qquad (29)$$

and the soft estimate $\hat{x}_i$ of the i-th branch after IC is

$$\hat{x}_i = m_i^T y_i \qquad (30)$$

subject to a unit power constraint like (24). The estimation (30) is referred to as IC-MMSE. The covariance matrix
of $y_i$, denoted as $R_{IC,i}$, is

$$R_{IC,i} = H Q_{IC,i} H^H + \frac{N_0}{2} I \qquad (31)$$

with $Q_{IC,i} = \frac{1}{2N_t} I - \mathrm{diag}\{\bar{x}_{IC,\bar{i}}\}\, \mathrm{diag}\{\bar{x}_{IC,\bar{i}}\}$. Substituting $R_{IC,i}$ of (31) for R in (25), (26), yields the IC-MMSE
solution $m_i$ and the corresponding MSE $\sigma_i^2$, respectively. Note that the IC-MMSE filter bank is a more general
solution than an MMSE filter bank for removing channel effects in MIMO scenarios. After IC-MMSE filtering the
soft estimate of the i-th branch is

$$\hat{x}_i = x_i + \hat{n}_i \qquad (32)$$

with $\hat{n}_i \sim \mathcal{N}(0, \sigma_i^2)$, or written in a matrix form as

$$\hat{x} = x + \hat{n}. \qquad (33)$$
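
The coordinate-level feedback path can be sketched in a few lines. The fragment below is a toy illustration under stated assumptions: a binary coordinate alphabet $\{-0.5, 0.5\}$ (so that the coordinate variance is 0.25, matching $1/(2N_t)$ for $N_t = 2$), a fixed $3 \times 3$ real channel, and no noise. The function names `soft_mean`, `cancel` and `q_ic_diag` are hypothetical helpers for the soft mean of (28), the cancellation step (29), and the diagonal of $Q_{IC,i}$ in (31).

```python
def soft_mean(alphabet, probs):
    """Average interference value of one coordinate, eq. (28)."""
    return sum(c * p for c, p in zip(alphabet, probs))

def cancel(y, H, x_bar, i):
    """Eq. (29): subtract the soft interference of all coordinates except the i-th."""
    xb = list(x_bar)
    xb[i] = 0.0
    return [yk - sum(h * v for h, v in zip(row, xb)) for yk, row in zip(y, H)]

def q_ic_diag(x_bar, i, var_x):
    """Diagonal of Q_IC,i in eq. (31): var_x minus the squared soft mean, i-th mean zeroed."""
    xb = list(x_bar)
    xb[i] = 0.0
    return [var_x - v * v for v in xb]

alphabet = [-0.5, 0.5]                   # assumed coordinate alphabet
var_x = 0.25
H = [[1.0, 0.5, -0.2], [0.3, 1.0, 0.4], [-0.1, 0.2, 1.0]]
x = [0.5, -0.5, 0.5]
y = [sum(h * v for h, v in zip(row, x)) for row in H]   # noiseless observation

# Perfect feedback: each coordinate is known with probability one.
perfect = [[0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
x_bar = [soft_mean(alphabet, p) for p in perfect]
y0 = cancel(y, H, x_bar, 0)              # only the contribution of x_0 remains

# No feedback: uniform probabilities give zero soft means and no cancellation.
no_info = [soft_mean(alphabet, [0.5, 0.5]) for _ in x]
```

With perfect feedback the residual covariance collapses onto the desired coordinate; with uniform feedback $Q_{IC,i}$ reduces to $\frac{1}{2N_t} I$ and the filter of (25) is recovered.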



B. Belief propagation detector for lattice code based on Tanner graph representation 

After IC-MMSE equalization, the soft estimate $\hat{x}$ of a lattice point is obtained. Recall that in lattice space-time
schemes, the codebook of transmitted vectors x is a lattice code $\mathcal{C}(\Lambda, u_0, \mathcal{R})$, where the generator matrix of $\Lambda$ is
$\Gamma B$. For simplicity, let B be a generic lattice generator matrix. Lattice detection is to either decide which lattice
point inside the shaping region has the minimum distance to $\hat{x}$, or calculate the soft information (e.g., in the form
of probability or log-likelihood ratio) about each candidate lattice point. The first detection criterion leads to hard
decision detectors, e.g., maximum likelihood (ML). The second detection criterion leads to soft decision detectors,
which can be used in iterations between detection and decoding. In this section, a novel Tanner graph based lattice
decoding algorithm is introduced. For simplicity, assume an m-dimensional lattice code, i.e., $x \in \mathbb{R}^m$.

The novel lattice decoding algorithm introduced below relies on Tanner graph representations of lattices [13], 
which are enabled by lattice partitioning; all lattice points (those inside the shaping region are of interest) are 
partitioned into several subgroups (cosets). Each subgroup includes several different lattice points, and is labelled 
by a well-defined Abelian group block codeword. Then, a reduced-complexity soft-output lattice detector can be 
obtained by operating on the smaller number of cosets instead of lattice points. The labels of all cosets form an 
Abelian block code, which can be represented by a Tanner graph similar to low-density-parity-check (LDPC) codes. 
Belief propagation on a lattice is performed on its non-binary label Tanner graph to yield the total and extrinsic 
APP of the labels and their coordinates, as described in the following subsections. The APPs of individual lattice 
points are obtained in a final step described in Section III-D. 

A somewhat subtler point is that lattice partitioning revolves around an orthogonal sublattice $\Lambda'$ of $\Lambda$, and the
quotient group $\Lambda/\Lambda'$; $|\Lambda/\Lambda'|$ is finite iff $\Lambda$ and $\Lambda'$ have the same dimensionality. The most straightforward way of
obtaining $\Lambda'$ is by G-S orthogonalization of $\Lambda$'s generator matrix, whereby all orthogonal G-S directions intersect
$\Lambda$ and the intersection naturally forms a sublattice of the same dimensionality as $\Lambda$; in all other cases the orthogonal
sublattice will have to be obtained by some means other than G-S orthogonalization.

1) Gram-Schmidt (G-S) orthogonalization: Given a generator matrix $B = [b_1 \ldots b_m]$, obtain a set of orthogonal
vectors $\{w_i\}_{i=1}^{m}$⁴. Let $W_i$ denote the vector space spanned by $w_i$, i.e., $W_i = \{\alpha w_i, \forall \alpha \in \mathbb{R}\}$; $S \stackrel{\mathrm{def}}{=} \{W_i\}_{i=1}^{m}$ is a
coordinate system.

2) Lattice label groups $G_i$: Let $P_{W_i}(\Lambda)$ be the projection of $\Lambda$ onto the vector space $W_i$, and $\Lambda_{W_i} \stackrel{\mathrm{def}}{=} \Lambda \cap W_i$.
The quotient group $P_{W_i}(\Lambda)/\Lambda_{W_i}$ is called a label group $G_i$; $\Lambda$ is now partitioned into a finite set of cosets labeled
by m-tuples from $G \stackrel{\mathrm{def}}{=} G_1 \times \ldots \times G_m$. The (finite) set of all label m-tuples, denoted $L(\Lambda)$, is called the label code,
and uses $G = G_1 \times \ldots \times G_m$ as its alphabet space.

3) Lattice label code $L(\Lambda)$: Due to the isomorphism $G_i \cong \mathbb{Z}_{g_i}$, with $g_i \stackrel{\mathrm{def}}{=} |G_i|$, let $G = \mathbb{Z}_{g_1} \times \cdots \times \mathbb{Z}_{g_m}$.
A lattice point will be labeled by the label of the coset to which it belongs. The label code $L(\Lambda)$ is an Abelian
block code. Let $l = [l_1 \ldots l_m]^T$ denote a label, and $\Lambda(l)$ denote the set of lattice points sharing the label $l$; clearly,
labeling is invariant to translations of $\Lambda$ by $u_0$. Let $L(\Lambda)$, $L(\mathcal{C}(\Lambda, u_0, \mathcal{R}))$ denote the label codes of $\Lambda$, and of the
subset of translated lattice points inside a shaping region $\mathcal{R}$, respectively. Then, a translated lattice point inside $\mathcal{R}$
will have a label $l \in L(\mathcal{C}(\Lambda, u_0, \mathcal{R}))$.

4) Finding a set of generator vectors $V^* \stackrel{\mathrm{def}}{=} \{v_j^*\}$ for the dual label code $L(\Lambda)^*$ of $\Lambda$'s label code $L(\Lambda)$ [13]:
The generator vectors $\{v_j^*\}$ characterize the lattice $\Lambda$ like parity check equations characterize a linear block
code, and have the following property: all the labels in $L(\Lambda)$ are orthogonal to every vector $v_j^*$ in $V^*$, i.e.,

$$v_j^{*T}\, l = 0 \mod \mathrm{lcm}(g_1, g_2, \cdots, g_m), \quad \forall l \in L(\Lambda), \qquad (34)$$

⁴ Essentially, $w_1 = b_1$, $w_i = b_i - \sum_{j=1}^{i-1} \mu_{ij} w_j$, $i = 2, \cdots, m$, where $\mu_{ij} = \langle b_i, w_j \rangle / \langle w_j, w_j \rangle$, and $\langle \cdot, \cdot \rangle$ denotes inner
product.
[Figure 1: variable nodes $l_1, \ldots, l_4$ and check nodes $V_1$: $3l_1 + l_2 + 5l_3 + 3l_4 = 0 \mod 6$; $V_2$: $l_2 + 2l_3 = 0 \mod 3$; $V_3$: $l_3 + l_4 = 0 \mod 2$.]

Fig. 1. Tanner graph for Example 3, following [13].

where lcm(·, ..., ·) is the least common multiple.

5) Lattice Tanner graph: The generator vectors $\{v_j^*\}$ act as check equations for the label code $L(\Lambda)$, according
to (34). Each coordinate of a label $l$ corresponds to a variable node, and each generator vector that defines a check
equation involving several label coordinates corresponds to a check node. A Tanner graph is constructed according
to the constraints placed on label coordinates by the generator vectors $\{v_j^*\}$. In general, the check equations
are not over GF(2), unless the cardinalities of the label groups $G_i$ are all two. Thereby, the TG of a lattice is,
generally, non-binary.

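Example 3: ($\Lambda = D_4$) A checkerboard lattice in $\mathbb{R}^4$, denoted $D_4$, has a generator matrix:

$$B = \begin{bmatrix} 1 & 1 & 1 & 2 \\ 1 & 0 & 1 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \qquad (35)$$

The associated Gram-Schmidt vectors are

$$w_1 = [\,1,\ 1,\ 0,\ 0\,]^T$$
$$w_2 = [\,1/2,\ -1/2,\ 1,\ 0\,]^T$$
$$w_3 = [\,-1/3,\ 1/3,\ 1/3,\ 1\,]^T$$
$$w_4 = [\,1/2,\ -1/2,\ -1/2,\ 1/2\,]^T$$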
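
The Gram-Schmidt vectors of this example can be reproduced exactly with rational arithmetic. The sketch below assumes that the columns of the generator matrix of Example 3 are $(1,1,0,0)$, $(1,0,1,0)$, $(1,1,1,1)$ and $(2,0,0,0)$, as read from (35), and applies the classical recursion $w_i = b_i - \sum_{j<i} \mu_{ij} w_j$ with $\mu_{ij} = \langle b_i, w_j \rangle / \langle w_j, w_j \rangle$; the function names are illustrative.

```python
from fractions import Fraction

# Columns b_1..b_4 of the generator matrix B (as read from eq. (35); an assumption).
B_cols = [(1, 1, 0, 0), (1, 0, 1, 0), (1, 1, 1, 1), (2, 0, 0, 0)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(vectors):
    """Classical Gram-Schmidt without normalization, in exact rational arithmetic."""
    ws = []
    for b in vectors:
        w = [Fraction(c) for c in b]
        for wj in ws:
            mu = dot(b, wj) / dot(wj, wj)       # mu_ij = <b_i, w_j> / <w_j, w_j>
            w = [wc - mu * wjc for wc, wjc in zip(w, wj)]
        ws.append(w)
    return ws

W = gram_schmidt(B_cols)
sq_norms = [dot(w, w) for w in W]               # squared lengths of w_1..w_4
```

The squared lengths come out as 2, 3/2, 4/3 and 1, and the vectors are mutually orthogonal.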

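In the coordinate system $\{W_i\}_{i=1}^{4} = \{\mathrm{span}(w_i)\}_{i=1}^{4}$, we obtain the following projections and cross-sections:

$$\begin{aligned}
P_{W_1}(\Lambda) &= \tfrac{1}{\sqrt{2}}\, \mathbb{Z}\, \tfrac{w_1}{\|w_1\|}, & \Lambda_{W_1} &= \sqrt{2}\, \mathbb{Z}\, \tfrac{w_1}{\|w_1\|}, \\
P_{W_2}(\Lambda) &= \tfrac{1}{\sqrt{6}}\, \mathbb{Z}\, \tfrac{w_2}{\|w_2\|}, & \Lambda_{W_2} &= \sqrt{6}\, \mathbb{Z}\, \tfrac{w_2}{\|w_2\|}, \\
P_{W_3}(\Lambda) &= \tfrac{1}{\sqrt{3}}\, \mathbb{Z}\, \tfrac{w_3}{\|w_3\|}, & \Lambda_{W_3} &= 2\sqrt{3}\, \mathbb{Z}\, \tfrac{w_3}{\|w_3\|}, \\
P_{W_4}(\Lambda) &= \mathbb{Z}\, \tfrac{w_4}{\|w_4\|}, & \Lambda_{W_4} &= 2\, \mathbb{Z}\, \tfrac{w_4}{\|w_4\|}.
\end{aligned} \qquad (36)$$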
This results in the following quotient groups for $D_4$: $G_1(\Lambda) = \{0, \frac{1}{\sqrt{2}}\}$, $G_2(\Lambda) = \{0, \frac{1}{\sqrt{6}}, \frac{2}{\sqrt{6}}, \frac{3}{\sqrt{6}}, \frac{4}{\sqrt{6}}, \frac{5}{\sqrt{6}}\}$,
$G_3(\Lambda) = \{0, \frac{1}{\sqrt{3}}, \frac{2}{\sqrt{3}}, \sqrt{3}, \frac{4}{\sqrt{3}}, \frac{5}{\sqrt{3}}\}$, $G_4(\Lambda) = \{0, 1\}$. The label code and the dual label code $L(\Lambda), L(\Lambda)^* \subset
\mathbb{Z}_2 \times \mathbb{Z}_6 \times \mathbb{Z}_6 \times \mathbb{Z}_2$ are, respectively [13],

$L(\Lambda)$ = {0000, 0031, 0220, 0251, 1300, 1331,
1520, 1551, 1140, 1111, 0440, 0411},

$L(\Lambda)^*$ = {0000, 0240, 0420, 1511, 1300, 1331,
0451, 1540, 1151, 0031, 1120, 0211}.

The generator set for $L(\Lambda)^*$ is $V^*$ = {1151, 0240, 0031}. Since $\mathrm{lcm}(g_1, g_2, g_3, g_4) = 6$, the TG of label code $L(\Lambda)$
can be constructed accordingly, as given in Fig. 1, where $V_j$ is the j-th check node, and $l_i$ is the i-th variable node. The
variable nodes associated with generator vector $v_j^*$ are connected to $V_j$, e.g., check node $V_1$ is connected to all four
variable nodes, because all variable nodes are involved in the first check equation.
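
The label code of this example can be regenerated from its check equations. The sketch below assumes label alphabets of sizes (2, 6, 6, 2) and a scaled inner product in which coordinate i is weighted by $\mathrm{lcm}/g_i$ before reduction modulo 6; this weighting is inferred from the coefficients shown in Fig. 1 rather than stated explicitly in the text. It enumerates every 4-tuple that satisfies all three checks and compares the result with the twelve labels listed above.

```python
from itertools import product

g = (2, 6, 6, 2)            # label group sizes g_1..g_4 (assumed from the example)
lcm = 6
V_star = [(1, 1, 5, 1), (0, 2, 4, 0), (0, 0, 3, 1)]   # generators 1151, 0240, 0031

def satisfies(label, v):
    """Scaled inner product of (34): sum_i (lcm/g_i) v_i l_i = 0 mod lcm."""
    return sum((lcm // gi) * vi * li for gi, vi, li in zip(g, v, label)) % lcm == 0

label_code = sorted(l for l in product(*(range(gi) for gi in g))
                    if all(satisfies(l, v) for v in V_star))

listed = sorted(tuple(int(c) for c in word) for word in
                "0000 0031 0220 0251 1300 1331 1520 1551 1140 1111 0440 0411".split())
```

Out of the 144 tuples in $\mathbb{Z}_2 \times \mathbb{Z}_6 \times \mathbb{Z}_6 \times \mathbb{Z}_2$, exactly the twelve listed labels pass the three checks.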

6) Non-binary belief propagation [14]: $P_{W_i}(\hat{x})$ denotes the projection of $\hat{x}$, which may not be in $\Lambda$, onto vector
space $W_i$, i.e., $P_{W_i}(\hat{x}) = \hat{x}^T w_i / \|w_i\|$. In the lattice Tanner graph a value $a \in \{0, 1, \ldots, g_i - 1\}$ of the variable
node $l_i$ is associated with the hypothesis that $\hat{x}$ is an observation of a lattice point whose label has i-th coordinate
equal to a (or, whose projection on the vector space $W_i$ belongs to the coset with label a); $\Pr(l_i = a)$ is the probability
of this hypothesis.

Define messages $q_{ij}^a$ and $r_{ji}^a$, where the subscripts i, j refer to the i-th variable node $l_i$ and the j-th check node $V_j$,
respectively. The quantity $q_{ij}^a$ is the probability of the hypothesis that $\hat{x}$ is an observation of a lattice point whose
label has i-th coordinate equal to a, given the information obtained via check nodes other than $V_j$; $r_{ji}^a$ is the
probability of check $V_j$ being satisfied given that $\hat{x}$ is an observation of a lattice point whose label has i-th
coordinate equal to a. The message passing is [14]:

$$r_{ji}^a = \sum_{\substack{l \in L(\Lambda):\ v_j^{*T} l = 0, \\ l_i = a}} \ \prod_{k \in \mathcal{N}(j) \setminus i} q_{kj}^{l_k} \qquad (37)$$

$$q_{ij}^a = \kappa_{ij}\, f_i^a \prod_{k \in \mathcal{M}(i) \setminus j} r_{ki}^a \qquad (38)$$

where $\kappa_{ij}$ are such that $\sum_a q_{ij}^a = 1$, $\mathcal{N}(j)$ is the set of variable nodes involved in check equation $V_j$, and $\mathcal{M}(i)$ is
the set of check nodes connected to variable node $l_i$; $f_i^a$ is the initial probability of event $l_i = a$ given observation
$\hat{x}$.
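
To make the message schedule concrete, the following sketch runs non-binary sum-product on the Tanner graph of the $D_4$ example. It is a minimal illustration, not the authors' implementation: the check update sums over local configurations of the variables attached to a check (an assumption about how the constrained sum is evaluated), the check coefficients are the scaled ones from Fig. 1, and the initial probabilities `f` are synthetic values favoring the valid label 1331. The final beliefs are formed as the product of the initial probability and all incoming check messages.

```python
from itertools import product

g = (2, 6, 6, 2)                                   # label alphabet sizes
lcm = 6
checks = [(3, 1, 5, 3), (0, 2, 4, 0), (0, 0, 3, 3)]  # scaled check coefficients, mod 6
N = [[i for i, c in enumerate(row) if c % lcm] for row in checks]      # variables per check
M = [[j for j, row in enumerate(checks) if row[i] % lcm] for i in range(4)]  # checks per variable

def normalize(p):
    s = sum(p)
    return [v / s for v in p]

def soft(size, a, p=0.9):
    """Synthetic initial distribution: mass p on value a, remainder spread evenly."""
    return [p if v == a else (1 - p) / (size - 1) for v in range(size)]

def bp(f, iterations=5):
    q = {(i, j): list(f[i]) for j in range(len(checks)) for i in N[j]}
    r = {}
    for _ in range(iterations):
        for j, row in enumerate(checks):           # check-to-variable messages r
            for i in N[j]:
                others = [k for k in N[j] if k != i]
                msg = [0.0] * g[i]
                for a in range(g[i]):
                    for cfg in product(*(range(g[k]) for k in others)):
                        total = row[i] * a + sum(row[k] * v for k, v in zip(others, cfg))
                        if total % lcm == 0:
                            w = 1.0
                            for k, v in zip(others, cfg):
                                w *= q[(k, j)][v]
                            msg[a] += w
                r[(j, i)] = msg
        for i in range(4):                         # variable-to-check messages q
            for j in M[i]:
                out = list(f[i])
                for k in M[i]:
                    if k != j:
                        out = [o * rv for o, rv in zip(out, r[(k, i)])]
                q[(i, j)] = normalize(out)
    beliefs = []
    for i in range(4):                             # final per-coordinate beliefs
        b = list(f[i])
        for j in M[i]:
            b = [bv * rv for bv, rv in zip(b, r[(j, i)])]
        beliefs.append(normalize(b))
    return beliefs

f_clean = [soft(2, 1), soft(6, 3), soft(6, 3), soft(2, 1)]
f_noisy = [soft(2, 1), soft(6, 3), soft(6, 3), [0.6, 0.4]]   # last coordinate mildly wrong
decide = lambda beliefs: [max(range(len(b)), key=b.__getitem__) for b in beliefs]
label_clean = decide(bp(f_clean))
label_noisy = decide(bp(f_noisy))
```

In the second run the last coordinate starts out favoring the wrong value, and the checks involving the other three coordinates pull it back to the label 1331.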



C. Initializing the lattice Tanner graph 

Belief propagation requires initializing $f_i^a$ for the TG; this can be done in either projection domain or probability
domain. After partitioning the infinite lattice into finitely many labeled cosets, not all labels are used by the points
inside the finite shaping region; due consideration must be given to this aspect.
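1) In projection domain: The soft estimate $\hat{x}$ obtained from the LMMSE filter bank is projected onto vector
spaces $\{W_i\}_{i=1}^{m}$ (see Fig. 2); in general, $f_i^a$ is initialized as:

(1). $\forall l \in L(\mathcal{C}(\Lambda, u_0, \mathcal{R}))$, find the closest $\lambda \in \mathcal{R} \cap \{\Lambda(l) + u_0\}$:

$$\lambda_{\min}(l) = \arg\min_{\lambda \in \Lambda(l)} \sum_{i=1}^{m} \left| P_{W_i}(\hat{x}) - P_{W_i}(\lambda) \right|^2 \qquad (39)$$

(2). Calculate the probability of (subgroup with) label $l$:

$$\Pr(l) = \frac{\exp\left( -\sum_{i=1}^{m} \frac{d_i^2(\lambda_{\min}(l))}{2\sigma_i^2} \right)}{\sum_{l' \in L(\mathcal{C}(\Lambda, u_0, \mathcal{R}))} \exp\left( -\sum_{i=1}^{m} \frac{d_i^2(\lambda_{\min}(l'))}{2\sigma_i^2} \right)}, \qquad (40)$$

with $d_i(\lambda_{\min}(l)) = |P_{W_i}(\hat{x}) - P_{W_i}(\lambda_{\min}(l))|$, and $\sigma_i^2$ of (26).

(3). Initialize $f_i^a$ from $\Pr(l)$:

$$f_i^a = \sum_{l:\ l_i = a} \Pr(l). \qquad (41)$$

Then $q_{ij}^a$ is initialized to $f_i^a$. The belief propagation algorithm is implemented by updating $r_{ji}^a$ and $q_{ij}^a$ iteratively
until a predetermined number of iterations is reached.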
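
Steps (2) and (3) amount to a softmax over labels followed by marginalization. The sketch below assumes that the per-direction distances $d_i$ of the closest point in each coset have already been found in step (1); the two-label toy code, the distance values and the function names are purely illustrative.

```python
import math

def label_probabilities(dist, sigma2):
    """Eq. (40): dist maps each label to its distances (d_1..d_m); sigma2 holds the sigma_i^2."""
    weight = {l: math.exp(-sum(d * d / (2 * s) for d, s in zip(ds, sigma2)))
              for l, ds in dist.items()}
    total = sum(weight.values())
    return {l: w / total for l, w in weight.items()}

def init_f(pr, g):
    """Eq. (41): f_i^a is the total probability of the labels whose i-th coordinate is a."""
    f = [[0.0] * gi for gi in g]
    for l, p in pr.items():
        for i, a in enumerate(l):
            f[i][a] += p
    return f

# Toy example: two labels over alphabets of size 2 (assumed values).
g = (2, 2)
dist = {(0, 0): (0.1, 0.2), (1, 1): (0.9, 0.8)}
sigma2 = (0.5, 0.5)
pr = label_probabilities(dist, sigma2)
f = init_f(pr, g)
```

Labels that do not occur inside the shaping region simply never appear in `dist`, so they receive zero probability and contribute nothing to the marginals.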

Remark 1: (Simplified initialization) One can examine $\hat{x}$ along each $W_i$ separately, with no precaution taken to
verify that selecting the closest projection coordinate in each direction, in isolation from other directions, yields
collectively a point inside the shaping region.

(1). $\forall l$, the minimum distance $d_i(l)$ along $W_i$ is

$$d_i(l) = \min_{\lambda \in \Lambda(l)} \left| P_{W_i}(\hat{x}) - P_{W_i}(\lambda) \right|. \qquad (42)$$

(2). Calculate the probability of the subgroup with label $l$ via

$$\Pr(l) = \frac{\prod_{i=1}^{m} \exp\left( -\frac{d_i^2(l)}{2\sigma_i^2} \right)}{\sum_{l' \in L(\mathcal{C}(\Lambda, u_0, \mathcal{R}))} \prod_{i=1}^{m} \exp\left( -\frac{d_i^2(l')}{2\sigma_i^2} \right)}.$$

Lastly, $f_i^a$ is initialized according to (41). This approach is referred to as simplified initialization; it is less
complicated than the previous one, at the cost of a slight performance loss.



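[Figure 2: projections of lattice points along one direction, marked O (coset 1) and X (coset 2), with the lower boundary, $P_{W_i}(\hat{x})$, and the upper boundary indicated on the axis.]

Fig. 2. Illustrative projection of a point $\hat{x} \in \mathbb{R}^m$ on one of the orthogonal directions $W_i$, $i = 1, \ldots, m$, whose label group has cardinality
$|G_i| = 2$.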
2) In probability domain: Given the soft estimates in $\hat{x}$, the likelihoods of each coordinate (see footnote 5) of $x \in \Lambda$ at the $k$-th MIMO channel use are calculated from the soft estimates in $\hat{x}$ (see footnote 6):

$$P(\hat{x}_i \mid x_i = c^j) = K \exp\left(-\|\hat{x}_i - c^j\|^2 / 2\sigma_i^2\right), \tag{43}$$

where $c^j$ is the $j$-th value of the real coordinate $x_i$ of $x \in \Lambda \cap \mathcal{R}$. Then, the likelihood of each value of coordinate $x_i$ at the $k$-th MIMO channel use will form the component $P_k(c^j; I)$ of a vector input $P_k(c; I)$ to a SISO APP module, following the model and notation in [16]; as in [16], $C_i$ will denote a random process enacted by a sequence of (coordinate) symbols taking values from some alphabet $\{c^j \mid j \in \mathcal{J}\}$, which may be nonbinary, i.e. $j$ is from a set of cardinality $|\mathcal{J}| > 2$.
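As a rough illustration of (43), the following Python sketch evaluates the Gaussian likelihood of one soft coordinate for every value in a (possibly nonbinary) alphabet. The function name, the example alphabet and the choice to normalize the result (which absorbs the constant $K$) are assumptions made for this sketch only.

```python
import math

def coordinate_likelihoods(x_hat_i, alphabet, sigma_i):
    """Normalized Gaussian likelihoods of one soft coordinate, as in (43).

    x_hat_i  : soft estimate of the i-th real coordinate
    alphabet : candidate coordinate values c^j (hypothetical example input)
    sigma_i  : assumed noise standard deviation for this coordinate
    """
    log_l = [-((x_hat_i - c) ** 2) / (2.0 * sigma_i ** 2) for c in alphabet]
    top = max(log_l)                       # guard against underflow
    w = [math.exp(v - top) for v in log_l]
    norm = sum(w)                          # normalization replaces K
    return [v / norm for v in w]

# A three-valued (nonbinary) coordinate alphabet.
p_c_in = coordinate_likelihoods(0.8, [-1.0, 0.0, 1.0], 0.5)
```

The resulting list plays the role of one component vector of the SISO input $P_k(c; I)$.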

D. Computation of extrinsic APP — either (lattice) point-wise or coordinate-wise — after belief propagation 

In order to implement iterative receivers it is necessary to compute the a posteriori probability at the end of belief propagation. After the last iteration, belief propagation returns $r_{ji}^a$ and $q_{ij}^a$, $\forall a, i, j$. Then, the total a posteriori probability $\Pr(l_i = a)$ is computed as

$$\Pr(l_i = a) = f_i^a \prod_{j \in M(i)} r_{ji}^a, \tag{44}$$

and the total a posteriori probability of each label is given by

$$\Pr(l = \{a_1, a_2, \dots, a_m\}) = \prod_{i=1}^{m} \Pr(l_i = a_i). \tag{45}$$
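A minimal Python sketch of (44) and (45) follows. The container layout (one dictionary per label coordinate, one list of check-to-variable messages per coordinate) and the explicit normalization of each coordinate distribution are assumptions of this sketch rather than details taken from the text.

```python
def total_app(f, r):
    """Per-coordinate total APP, following (44).

    f[i][a] : initial probability that label coordinate i equals a
    r[i]    : list of dicts, one per check node j in M(i), holding r_{ji}^a
    """
    post = []
    for f_i, r_i in zip(f, r):
        w = {}
        for a, fa in f_i.items():
            prod = fa
            for r_ji in r_i:
                prod *= r_ji[a]
            w[a] = prod
        norm = sum(w.values())             # assumed normalization step
        post.append({a: v / norm for a, v in w.items()})
    return post

def label_app(post, label):
    """Total APP of a full label, following (45)."""
    p = 1.0
    for post_i, a in zip(post, label):
        p *= post_i[a]
    return p

f = [{0: 0.5, 1: 0.5}, {0: 0.25, 1: 0.75}]
r = [[{0: 0.9, 1: 0.1}], [{0: 0.5, 1: 0.5}, {0: 0.5, 1: 0.5}]]
post = total_app(f, r)
p_label = label_app(post, (0, 1))
```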

In Appendix I it is shown that when a lattice is represented by a Tanner graph, it is possible to associate a Markov process with the model for soft detection of lattice points, as shown in Fig. 3; also, that the extrinsic APPs $P_k^{BP}(c^j; O)$ and $P_k^{BP}(u^j; O)$ after belief propagation, corresponding to the $k$-th transition between states, can be computed as:

$$P_k^{BP}(c^j; O) = \sum_{e:\, c^j(e) = c^j} \Pr(l_{s^S(e)}) \prod_{i=1}^{m} P_k[u^i(e); I] \times \prod_{i=1,\, i \neq j}^{m} P_k[c^i(e); I], \tag{46}$$

$$P_k^{BP}(u^j; O) = \sum_{e:\, u^j(e) = u^j} \Pr(l_{s^S(e)}) \prod_{i=1,\, i \neq j}^{m} P_k[u^i(e); I] \times \prod_{i=1}^{m} P_k[c^i(e); I], \tag{47}$$

where $l_{s^S(e)}$ is the label indexed by the integer value of the starting state $s^S(e)$ of edge $e$. $P_k[u^i(e); I]$ and $P_k[c^i(e); I]$ are the a priori probabilities of an unencoded, respectively encoded, symbol element (in this case a coordinate; see footnote 7) at position $i$, which are associated with edge $e$ [16]. In a serial concatenation such as in Fig. 4, the unencoded symbol elements are assumed to be identically distributed according to a uniform distribution, and

5 A real coordinate of a lattice point, not an integer coordinate of a label. 

6 The subscript k, which would indicate the time index of the relevant MIMO channel use, is omitted here and in Fig. 4 for simplicity of 
notation. 

7 I.e., not necessarily a binary symbol, or bit. 
$P_k[u^i(e); I]$ is the reciprocal of the alphabet size at position $i$. $P_k[c^i(e); I]$ are the likelihoods of lattice point coordinates, which can be computed as in the Tanner graph initialization step.




Fig. 3. State transition diagram for Markov process representing a sequence of lattice points. Edges occur in clusters because every label 
generally covers more than one point in the shaping region. States are label indices; the state at any time is the index of the label that contains 
the most recent lattice point output by the Markov source. When the Markov source outputs a new point it transitions into the state indexing 
the label that contains the new point. 



IV. Application to the detection of super-orthogonal lattice space-time code 

Consider the superorthogonal space-time code [2], [3], [4], [5], [6], [7] as the MIMO transmission scheme. The 
decoding algorithm developed in the previous section combined with hypothesis testing is introduced as an efficient 
MIMO detector. 

A. Receiver for quasistatic scenarios 

Consider the superorthogonal space-time code given in Example 2. The ML receiver for $\chi^{\oplus}$ is given by

$$\hat{\chi}^{\oplus}_{ML} = \arg\min_{\chi^{\oplus}} \| y - H^{\oplus} \chi^{\oplus} \|^2. \tag{48}$$

The ML receiver is usually computationally expensive since it needs to examine all valid lattice points (complexity grows exponentially). The algorithm introduced in Section III offers a computationally efficient solution.

Recall that for a superorthogonal space-time code (see Example 2), either all $\chi_i$ or all $\chi'_i$ are zeros, which identifies two hypotheses: hypothesis $H_1$ is that the $\chi'_i$ are all zeros, and the base matrices $C$ are chosen; hypothesis $H_2$ is that the $\chi_i$ are all zeros, and the base matrices $C'$ are chosen. When hypothesis $H_1$ is true, the transmission model (19) can be simplified as

$$y = H_1^{\oplus} \chi + n. \tag{49}$$
Fig. 4. Block diagram of the novel iterative receiver for the super-orthogonal space-time lattice code in the presence of a coordinate interleaver. The inner decoder (IC-MMSE) comprises interference cancellation (IC) stages followed by FIR filters; the outer decoder comprises the likelihood initialization, the SISO and BP blocks, which exchange $P(c; I)$, $P(c; O)$, $P(u; I)$ and $P(u; O)$, and a soft estimator.



When hypothesis $H_2$ is true, we have

$$y = H_2^{\oplus} \chi' + n. \tag{50}$$

Due to the orthogonality of the matrices $H_k^{\oplus}$, $k = 1, 2$, the MMSE filters for $\chi$, $\chi'$ are the corresponding matched filters

$$M_k = \frac{1}{\alpha} (H_k^{\oplus})^H, \quad k = 1, 2, \tag{51}$$

where $M_k$ is the MMSE filter for hypothesis $H_k$. The outputs of the MMSE filters for hypotheses $H_1$ and $H_2$ are then given by

$$\hat{\chi} = M_1 y = \frac{1}{\alpha} (H_1^{\oplus})^H y = \chi + \tilde{n}_1, \tag{52}$$

$$\hat{\chi}' = M_2 y = \frac{1}{\alpha} (H_2^{\oplus})^H y = \chi' + \tilde{n}_2, \tag{53}$$

where $\tilde{n}_1$ and $\tilde{n}_2$ are the estimation noise after filtering for hypotheses $H_1$ and $H_2$, respectively. It is not difficult to see that $\tilde{n}_k$, $k = 1, 2$, are white multivariate Gaussian random vectors, i.e., zero-mean with a covariance matrix proportional to the identity. It should be pointed out that the IC is not necessary for this scenario and that (52), (53) are interference-free estimates of $\chi$ and $\chi'$, respectively, due to the orthogonality of $H_k^{\oplus}$.
The probability of hypothesis $H_1$ given $y$ is:

$$\Pr(H_1 \mid y) = \sum_{\chi} \Pr(H_1, \chi \mid y). \tag{54}$$

In (54), summing over all valid values of $\chi$ becomes infeasible as the length of $\chi$ increases. In order to reduce the complexity, the summation in (54) is approximated by its largest term. That is,

$$\Pr(H_1 \mid y) \approx \max_{\chi} \Pr(H_1, \chi \mid y) \propto p(y \mid H_1^{\oplus}, \chi_{\max}) \tag{55}$$

with

$$\chi_{\max} = \arg\max_{\chi} p(y \mid H_1^{\oplus}, \chi) = \arg\min_{\chi} \|y - H_1^{\oplus}\chi\|^2 = \arg\min_{\chi} \|\chi - \hat{\chi}\|^2 = \mathrm{sign}(\hat{\chi}), \tag{56}$$

where $\hat{\chi}$ is the output of the LMMSE filtering for hypothesis $H_1$ and is given in (52). Similarly,

$$\Pr(H_2 \mid y) \approx \max_{\chi'} \Pr(H_2, \chi' \mid y) \propto p(y \mid H_2^{\oplus}, \chi'_{\max}), \tag{57}$$

$$\chi'_{\max} = \arg\min_{\chi'} \|y - H_2^{\oplus}\chi'\|^2 = \mathrm{sign}(\hat{\chi}'). \tag{58}$$

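The log likelihood ratio of hypotheses $H_1$ and $H_2$ is

$$L(H) = \log \frac{\Pr(H_1 \mid y)}{\Pr(H_2 \mid y)} \approx \log \frac{p(y \mid H_1^{\oplus}, \chi_{\max})}{p(y \mid H_2^{\oplus}, \chi'_{\max})} \propto \|y - H_2^{\oplus}\chi'_{\max}\|^2 - \|y - H_1^{\oplus}\chi_{\max}\|^2 \propto y^H H_1^{\oplus}\chi_{\max} - y^H H_2^{\oplus}\chi'_{\max},$$

which, in terms of the filter outputs, reads

$$L(H) = \frac{4\alpha^2}{N_0} \left( \hat{\chi}^H \chi_{\max} - \hat{\chi}'^H \chi'_{\max} \right). \tag{59}$$

Substituting (56) and (57) into (59) yields

$$L(H) = \left( \mathrm{ABS}(\hat{\chi}) - \mathrm{ABS}(\hat{\chi}') \right) 4\alpha^2 / N_0, \tag{60}$$

where $\mathrm{ABS}(a) = \sum_i |a_i|$. Consequently, the probabilities of hypotheses $H_1$, $H_2$ can be obtained from $L(H)$:

$$\Pr(H_k \mid y) = \frac{1}{1 + \exp(\mp L(H))}, \quad k = 1, 2. \tag{61}$$

For each hypothesis one can apply the lattice detection algorithm developed in Section III for detecting $\chi$. We treat the information-bearing vector $\chi$ as a lattice point with generator matrix $B$, i.e., $\chi = Bu$. For example, the equivalent model for detecting lattice point $\chi$ is $\hat{\chi} = Bu + \tilde{n}_1$, where $\hat{\chi}$ is the output of matched filtering for hypothesis $H_1$. Since $\chi$ is from a $D_4$ lattice, its generator matrix $B$ is given in (35). The APPs can be obtained according to Section III.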
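The hypothesis test of this subsection reduces to a comparison of two sums of absolute values. The Python sketch below illustrates that computation under stated assumptions: `scale` is a placeholder for the constant that multiplies the difference in (60), ABS is read as the sum of absolute coordinate values, and the function names are hypothetical.

```python
import math

def _sigmoid(z):
    # numerically safe logistic function
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def hypothesis_probs(chi_hat, chi_hat_prime, scale):
    """Return (L(H), Pr(H1|y), Pr(H2|y)) from the two matched-filter outputs.

    chi_hat, chi_hat_prime : real filter outputs under H1 and H2
    scale                  : assumed stand-in for the constant in (60)
    """
    abs_sum = lambda v: sum(abs(t) for t in v)
    llr = scale * (abs_sum(chi_hat) - abs_sum(chi_hat_prime))
    p1 = _sigmoid(llr)    # (61) with the minus sign, k = 1
    p2 = _sigmoid(-llr)   # (61) with the plus sign, k = 2
    return llr, p1, p2

# The branch carrying the signal has the larger filter-output magnitude.
llr, p1, p2 = hypothesis_probs([0.9, -1.1, 1.0, -0.8],
                               [0.1, -0.2, 0.05, 0.1], 2.0)
```

In this toy call the first branch dominates, so `p1` is close to one and the detector of Section III would mainly be weighted toward hypothesis $H_1$.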



B. Iterative receiver for coordinate interleaving in fast fading 

Coordinate interleaving, along with the outer iteration loop in Fig. 4, is now considered; the real and imaginary parts of all complex symbols in a frame are collectively scrambled before transmission [18]. $Y = \{y_1, y_2, \dots, y_N\}$ denotes a frame spanning $N$ MIMO channel uses at the MIMO channel output (before deinterleaving). Note that the structure of the superorthogonal lattice code is removed during transmission, and has to be recovered before detection. The applicable receive equation is (6) rather than (19); the iterative IC-MMSE attempts to remove the cross-antenna interference, i.e. to undo the channel $H$ on a per MIMO channel use basis. During the first iteration, the soft feedback from the detector/decoder is null. The output of IC-MMSE is always deinterleaved, thus restoring the superorthogonal structure and yielding the soft output $\hat{X} = \{\hat{x}_1, \hat{x}_2, \dots, \hat{x}_N\}$ with

$$\hat{x}_t = \Gamma \chi^{\oplus}_t + \tilde{n}_t. \tag{62}$$

Since the information-bearing vector $\chi^{\oplus}_t$ is a direct sum of two $D_4$ lattices, and the effective channel gain matrix $\Gamma$ is unitary, the equalization approach in Section IV-A applies to eq. (62). $\Pr(H_k \mid \hat{x}_t)$, $k = 1, 2$, are associated with the following transmission models upon removing $\Gamma_1$, $\Gamma_2$ respectively:

$$H_1: \hat{\chi}_t = B u_t + \tilde{n}^1_t, \tag{63}$$

$$H_2: \hat{\chi}'_t = B u'_t + \tilde{n}^2_t, \tag{64}$$

where $\hat{\chi}_t$, $\hat{\chi}'_t$ and the noise terms $\tilde{n}^1_t$, $\tilde{n}^2_t$ are obtained from $\hat{x}_t$ and $\tilde{n}_t$ by removing $\Gamma_1$ and $\Gamma_2$, respectively. The generator matrix $B$ is given in (35). For each hypothesis, the lattice decoding algorithm can be applied to compute the extrinsic APPs $P(u; O)$ and $P(c; O)$.

Inner-loop iterative decoding between SISO and BP, as shown in Fig. 4, can further improve the overall performance, especially in the presence of forward error correction coding, when decoding follows detection. Herein, only an uncoded system is considered in order to illustrate the concept. Even in an uncoded system it is possible to perform inner-loop iterations between $P^{BP}(c; O)$ from the belief propagation module and $P(u; I)$ from the SISO block; more benefit is derived, however, when a decoder is part of the inner loop.

V. Simulations 

Simulation results for a superorthogonal space-time lattice code with 4PSK constellation (Example 2), in both quasistatic and fast fading channels, are discussed. Each half of the superorthogonal constellation belongs to a $D_4$ lattice, implicitly defining a shaping region; only six of the twelve $L(\Lambda)$ labels listed in Example 3 (first four, last two) are needed to cover the lattice points in the shaping region. In order to test the algorithm's efficiency, only the most likely label (or the two most likely labels) after belief propagation is retained; the others receive zero probability (re-normalization is performed after setting the probabilities of the discarded labels to zero).
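The label-pruning step described above can be sketched as follows. This is a minimal illustration in Python; the function name and the string label identifiers are hypothetical, and ties between equally likely labels are broken arbitrarily by the sort.

```python
def keep_top_labels(label_probs, n_keep):
    """Keep the n_keep most likely labels, zero the rest, and renormalize.

    label_probs : dict mapping a label identifier to its probability
    n_keep      : number of surviving labels (1 or 2 in the simulations)
    """
    ranked = sorted(label_probs, key=label_probs.get, reverse=True)
    kept = set(ranked[:n_keep])
    total = sum(label_probs[lab] for lab in kept)
    return {lab: (p / total if lab in kept else 0.0)
            for lab, p in label_probs.items()}

# Toy post-BP label probabilities (illustrative values only).
pruned = keep_top_labels({"l1": 0.5, "l2": 0.3, "l3": 0.2}, 2)
```

With two survivors, the discarded mass is redistributed proportionally over the retained labels, so their relative likelihood is unchanged.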

A. Quasistatic fading 

The channel is constant over $T = 2$ symbol periods. In our simulations, each data packet includes 500 superorthogonal codewords. Each point on the curves plotted in Fig. 5 and Fig. 6 is obtained by testing 2000 independent data packets.

Fig. 5 shows the FER (frame error ratio; see footnote 8) vs. $E_b/N_0$ for the super-orthogonal space-time code when the coordinate interleaver is absent. QPSK modulation is employed and the channel spectral efficiency is 2.5 bits/channel use. The performance of the ML algorithm, which exhaustively searches all valid codewords and picks the one with the maximum likelihood, is plotted as reference. For the MMSE-BP algorithm, we run one iteration on the Tanner graph and collect the probabilities of the label coordinates. Then we consider choosing one surviving label and two surviving labels. The simulation results show that the MMSE-BP algorithm with one or two surviving labels has the same performance as the ML algorithm. The MMSE-BP with simplified initialization, which reduces the overall complexity, is also examined. In this case we consider two surviving labels; the results show that it is about 0.5 dB away from the ML performance in the low SNR region. As SNR increases, the MMSE-BP with simplified initialization approaches the ML performance asymptotically.

8 One frame is meant to be one super-orthogonal space-time codeword.

Fig. 5. FER vs. $E_b/N_0$; super-orthogonal space-time lattice code, MMSE followed by BP. ML and BP with one and two surviving labels are identical. The curve with square markers illustrates the effect of simplified initialization.

B. Fast fading 

Fast fading simulations include a coordinate interleaver. In our simulations, a depth-eight traditional block interleaver is considered. QPSK is used and the channel spectral efficiency is 2.5 bits/channel use. Two inner iterations are run between the SISO block and the BP block; one iteration is run on the lattice Tanner graph inside the BP block. We simulate different scenarios in which different numbers of surviving labels are considered. In addition, an iterative interference cancellation scheme is considered to improve the overall performance. The soft estimator computes the soft estimates of the coordinates of the lattice point based on the output of the BP block ($P(u; O)$). Fig. 6 shows the FER vs. $E_b/N_0$ for different numbers of surviving labels and different numbers of iterations between the IC-MMSE and the outer decoder.
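For readers unfamiliar with coordinate interleaving, the following Python sketch shows one conventional row/column block interleaver applied to the real and imaginary parts of a frame of complex symbols. It is an assumption-laden illustration: the text does not specify the interleaver layout, so "depth" is taken here to mean the number of rows of a matrix written row by row and read column by column, and all function names are hypothetical.

```python
def to_coordinates(symbols):
    """Flatten complex symbols into a list of real coordinates (re, im)."""
    return [part for s in symbols for part in (s.real, s.imag)]

def block_interleave(seq, depth):
    """Write seq row by row into `depth` rows, read it column by column."""
    if len(seq) % depth:
        raise ValueError("sequence length must be a multiple of depth")
    width = len(seq) // depth
    return [seq[r * width + c] for c in range(width) for r in range(depth)]

def block_deinterleave(seq, depth):
    """Inverse of block_interleave for the same depth."""
    if len(seq) % depth:
        raise ValueError("sequence length must be a multiple of depth")
    width = len(seq) // depth
    out = [None] * len(seq)
    idx = 0
    for c in range(width):
        for r in range(depth):
            out[r * width + c] = seq[idx]
            idx += 1
    return out

# Eight QPSK-like symbols give sixteen real coordinates.
frame = [complex(1, 1), complex(-1, 1), complex(1, -1), complex(-1, -1)] * 2
coords = to_coordinates(frame)
scrambled = block_interleave(coords, 8)
restored = block_deinterleave(scrambled, 8)
```

After interleaving, the real and imaginary parts of one symbol no longer travel in the same channel use, which is what lets each coordinate see an independent fade.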

VI. Conclusion 

A soft-output closest point search in lattices was introduced, via a form of belief propagation on a lattice. Due to the coding gain associated with a lattice, structural relations exist between certain lattice points, which can be associated via an equivalence relation for detection purposes. This leads to a soft-output detection algorithm, which can generate both total and extrinsic a posteriori probability at the detector's output. The step-back feature of classic sphere decoding is eliminated.

Fig. 6. FER vs. $E_b/N_0$ of iterative decoding based on IC-MMSE plus BP for the super-orthogonal space-time lattice code with coordinate interleaver. (Legend includes: ML; QF w/CI: MMSE+BP (2 lbl), $n_{IC} = 4$ iter.)



Herein, the expressions for extrinsic a posteriori probabilities (46), (47), at the belief propagation detector's 
output, are derived; the extrinsic probabilities are needed in iterative receivers. Here, the goal of detection is to 
provide soft information about valid channel alphabet symbols, i.e. real coordinates of the complex symbols from 
the modulation constellations used on various transmit antennas; this information about coordinates can be used 
to revert the effect of a coordinate interleaver, or can be forwarded directly to a soft decoder for some coded 
modulation encoder. Alternatively, it can be used for soft or hard demodulation, e.g. in the case of bit interleaved 
coded modulation, or with plain uncoded transmission. 

When a lattice is represented by a Tanner graph, it is possible to associate a Markov process with the model for soft detection of lattice points in a natural way. This is enabled by first viewing the sequence of lattice points passed through the channel as a Markov source. Another observation is that, in general, simple detection (with or without soft information) is by itself memoryless; one should therefore expect the Markov process to be degenerated in some way, in order to reflect the memoryless nature of simple (non-iterative) detection. The objective of detection is to determine the a posteriori (total or extrinsic) probabilities of the output of the Markov source. In order to leverage known results, even in the case of plain, unencoded transmission (no forward error correcting redundancy added by encoding), one can view the output $c$ of the Markov source (a lattice point, i.e. a vector of lattice coordinates) as the result of mapping with rate one (i.e. no additional redundancy) an identical replica of the input, $u = c$; this is a degenerated Markov process where even the dependence of the future on the present is removed. The only remaining structure to be captured for the Markov source, in the case when the candidate points are from a lattice, must reflect the partitioning in labeled cosets, as discussed in Section III-B. To this end, note that the labels themselves can be associated with states having integer values by virtue of the following convention: the state $S_{k-1}$ at time $k-1$ is the index of the label that contains the most recent lattice point output by the Markov source, i.e. at time $k-1$; when the Markov source outputs a new point at time $k$ it transitions into state $S_k$, equal to the integer indexing the label that contains the new point. Alternatively, with respect to the mapping $u \mapsto c$ and omitting the time index, when $u = \lambda \in \Lambda$ occurs at the rate-one block input, the Markov process transitions into the state whose (integer) value indexes the label containing $\lambda$. This is represented in Fig. 3, where $e$ denotes an edge between starting state $s^S(e)$ and ending state $s^E(e)$. Formally, for any edge $e$, at any time, if $u(e) = \lambda \in \Lambda(l_i) \subset \Lambda$, where $i \in \{1, \dots, |L(\Lambda)|\}$ indexes one of the $|L(\Lambda)|$ labels, then the ending state is $s^E(e) = i$ and the Markov source outputs $c(e) = u(e)$. There is a bijective mapping $\ell$ between integer states and labels, $s \mapsto l_s$, such that, for any integer state $s \in \{1, \dots, |L(\Lambda)|\}$, $\ell(s) \stackrel{\mathrm{def}}{=} l_s$ is the label associated with $s$.

The Markov sequence of random points selected from the lattice can thus be viewed as generated by state transitions triggered by $u = \lambda \in \Lambda$; although the realizations of $u$ on the lattice grid are random, a state model arises as a result of partitioning the lattice in equivalence classes. That is, there exist certain structural relations between certain points, which can be associated via an equivalence relation. The state probabilities, used in a posteriori probability calculations, are seen to be associated with the probabilities of these equivalence classes (or their labels), which can be obtained separately from belief propagation on the lattice's Tanner graph, as shown next.

In general, for a Markov process generated by triggering state transitions via some input (e.g. a classical 
convolutional code), the new state depends on the current input and several previous inputs; in the case at hand the 
new state depends only on the current input. This illustrates the degenerated nature of the Markov process at hand, 
seen thereby to be memoryless. 

The memoryless nature of the Markov process is also apparent in the fact that any state can be reached in one transition from any state, and the probability distribution of the states does not depend on time; it depends only on the probability distribution for $u$, and so does the probability distribution of the output of the Markov process. The output of the Markov process does not depend on the current state, but rather on the input $u$; the input determines both the new output and the new state, which implies that the output at any time does not depend on any previous state.
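The degenerated state model can be made concrete with a short Python sketch. The point-to-label table below is a made-up toy example, and both function names are hypothetical; the sketch only illustrates that the ending state is a function of the input point alone and that the state distribution is inherited from the distribution of $u$.

```python
def next_state(current_state, point, label_index_of):
    """Ending state of the edge triggered by `point`.

    current_state is deliberately unused: the new state is the index of the
    label containing the new point, whatever the starting state was.
    """
    return label_index_of[point]

def state_distribution(point_probs, label_index_of):
    """State probabilities induced by the distribution of the input u."""
    dist = {}
    for point, p in point_probs.items():
        s = label_index_of[point]
        dist[s] = dist.get(s, 0.0) + p
    return dist

# Toy partition: label 1 covers two points, label 2 covers one point.
label_index_of = {(0, 0): 1, (2, 2): 1, (1, 1): 2}
point_probs = {(0, 0): 0.25, (2, 2): 0.25, (1, 1): 0.5}
dist = state_distribution(point_probs, label_index_of)
```

Since several points map to the same label, several edges lead into the same state, which corresponds to the edge clusters drawn in Fig. 3.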

The remainder of this appendix will use the state transition diagram in Fig. 3 for the Markov process that forms the object of detection; the results in [16], [17] apply. Following [16], the extrinsic APPs $P_k^{BP}(c^j; O)$ and $P_k^{BP}(u^j; O)$ during the $k$-th transition between states have the general expressions

$$P_k^{BP}(c^j; O) = \sum_{e:\, c^j(e) = c^j} A_{k-1}[s^S(e)] \prod_{i=1}^{m} P_k[u^i(e); I] \times \prod_{i=1,\, i \neq j}^{m} P_k[c^i(e); I]\, B_k[s^E(e)], \tag{65}$$

$$P_k^{BP}(u^j; O) = \sum_{e:\, u^j(e) = u^j} A_{k-1}[s^S(e)] \prod_{i=1,\, i \neq j}^{m} P_k[u^i(e); I] \times \prod_{i=1}^{m} P_k[c^i(e); I]\, B_k[s^E(e)], \tag{66}$$

where $A_{k-1}[s^S(e)]$ and $B_k[s^E(e)]$ are the probabilities of the current state and the new state that are associated with edge $e$. Following the well-known results and notation in [17] and using the memoryless nature of the Markov process in Fig. 3,

$$A_k[s] = \Pr\{S_k = s; y_1^k\} = \Pr\{S_k = s; y_k, y_1^{k-1}\} = \Pr\{S_k = s; y_k \mid y_1^{k-1}\} \Pr\{y_1^{k-1}\} \tag{67}$$

$$= \Pr\{S_k = s; y_k\} \Pr\{y_1^{k-1}\} = \Pr\{S_k = s; y_k\}\, \kappa_0, \tag{68}$$

where, following [17], $y_0^{\tau}$ denotes the observations of the relevant Markov process, as taken at the output of a discrete memoryless channel at time instants $0, 1, \dots, \tau$. Most importantly, the factor $\kappa_0$ does not depend on the state $s$, and is thereby cancelled out during the normalization step that enforces $\sum_s A_k[s] = 1$. Due to the isomorphism between states and labels it follows that $\Pr\{S_k = s; y_k\}$ is the label probability $\Pr(\ell(s)) = \Pr(l_s)$ calculated as in (45). From [17] and the properties of the degenerated Markov process,

$$B_k[s] = \Pr\{y_{k+1}^{\tau} \mid S_k = s\} = \Pr\{y_{k+1}^{\tau}\}, \tag{69}$$

which does not depend on the state $s$ and behaves as a constant that is cancelled out during the normalization step enforcing $\sum_s B_k[s] = 1$. Therefore (46), (47) follow.